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Abstract 

The adaptive rejection sampling (ARS) algorithm is a universal random gen¬ 
erator for drawing samples efficiently from a univariate log-concave target 
probability density function (pdf). ARS generates independent samples from 
the target via rejection sampling with high acceptance rates. Indeed, ARS 
yields a sequence of proposal functions that converge toward the target pdf, 
so that the probability of accepting a sample approaches one. However, sam¬ 
pling from the proposal pdf becomes more computational demanding each 
time it is updated. In this work, we propose a novel ARS scheme, called 
Cheap Adaptive Rejection Sampling (CARS), where the computational ef¬ 
fort for drawing from the proposal remains constant, decided in advance by 
the user. For generating a large number of desired samples, CARS is faster 
than ARS. 

Keywords: Monte Carlo methods. Rejection Sampling, Adaptive Rejection 
Sampling 


1. Introduction 


Random variate generation is required in different fields and several ap- 


plications ( 

Devroye, 1986 

Hermann et al.| 

2003 

Robert and Casella, 

2004). 

Rejection sampling (RS) ( 

Robert and Casella 

2004, Chapter 2) is a universal 


sampling method which generates independent samples from a target prob¬ 
ability density function (pdf). The sample is either accepted or rejected by 
an adequate test of the ratio of the two pdfs. However, RS needs to establish 
analytically a bound for the ratio of the target and proposal densities. 
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Given a target density, the adaptive rejection sampling (ARS) method 


(Gilks and Wild, 1992; Gilks, 1992) produces jointly both a suitable proposal 


pdf and the upper bound for the ratio of the target density over this proposal. 
Moreover, the main advantage of ARS is that ensures high acceptance rates, 
since ARS yields a sequence of proposal functions that actually converge 
toward the target pdf when the procedure is iterated. The construction of the 
proposal pdf is obtained by a non-parametric procedure using a set of support 
points (nodes), with increasing cardinality. When a sample is rejected in the 
RS test, it is added to the set of support points. One limitation of ARS is that 


Martino et a 


reason, several extensions have been proposed ( 

Hermann 

1995 

|Hirose and 

A.Todoroki, 

2005[ Evans and Swartz, 1998 Goriir and Tehj |2011[ |Martino 

and Miguez 

2011), even mixing with MGMG techniques ( 

Gilks et al. 

19951 


2013, 2015a). A related RS-type method, automatic but non- 


adaptive, that employs a piecewise constant construction of the proposal 
density obtained with a pruning of the initial nodes, has been suggested in 
(Martino et ah, 2015b). 

In this work, we focus on the computational cost required by ARS. The 
ARS algorithm obtains high acceptance rates improving the proposal func¬ 
tion, which becomes closer and closer to target function. Hence, this en¬ 
hancement of the acceptance rate is obtained building more complex propos¬ 
als, which become more computational demanding. The overall time of ARS 
depends on both the acceptance rate and the time required for sampling from 
the proposal pdf. The computational cost of ARS remains bounded since the 
probability of updating the proposal pdf, Pt, vanishes to zero as the num¬ 
ber of iterations t grows. However, for a hnite t, there is always a positive 
probability Pt > 0 of improving the proposal function, producing an increase 
of the acceptance rate. This enhancement of the acceptance rate could not 
balance out the increase of the time required for drawing from the new up¬ 
dated proposal function. Namely, if the acceptance rate is enough close to 1, 
a further improvement of the proposal function could become prejudicial. 

Thus, we propose a novel ARS scheme, called Gheap Adaptive Rejection 
Sampling (GARS), employing always a hxed number of nodes, i.e., the com- 


^The possibility of applying ARS for drawing for multivariate densities depends on the 
ability of constructing a sequence of non-parametric proposal pdfs in higher dimensions. 
See, for instance, the piecewise constant construction in (Martino et al., 2015aI as a simpler 
alternative procedure. 
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putational effort required for sampling from the proposal remains constant, 
selected in advance by the user. The new technique is able to increase the 
acceptance rate on-line in the same fashion of the standard ARS method, 
improving adaptively the location of the support points. The conhguration 
of the nodes converges to the best possible distribution which maximizes the 
acceptance rate achievable with a hxed number of support points. Clearly, 
the maximum obtainable acceptance rate with CARS is always smaller than 

1, in general. However, for large value of required samples, the CARS algo¬ 
rithm is faster than ARS for generating independent samples from the target, 
as shown the numerical simulations. 

2. Adaptive Rejection Sampling 

We denote the target density as 

7t(x) = — 7i(x) = — exp (R(a;)), a: G A C M, 

Ctt 

with 7i{x)dx. The adaptive proposal pdf is denoted as 

qt{x) = —qt{x) = — exp {Wt{x)), 

^TT 

where q = qt{x)dx and t eN. In order to apply rejection sampling (RS), 
it is necessary to build qt{x) as an envelope function of 7i{x), i.e., 

qt{x) >7r{x), or Wt{x)>V{x), (1) 

for all a; G A and t eN. As a consequence, it is important to observe that 

ct > c^, Vt G N. (2) 

Let us assume that V{x) = log 7i{x) is concave, and we are able to evaluate 
the function V{x) and its hrst derivative R'(a;)|^ The adaptive rejection 


^The evaluation of V'{x) is not strictly necessary, since the function qt{x) can also 
construct using a derivative-free procedure (e.g., see (Gilks 1992) or the piecewise con¬ 


stant construction in (Martino et al. 2015al). For the sake of simplicity, we consider the 
construction involving tangent lines. 
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sampling (ARS) technique (Gilks, 1992 Gilks and Wild, 1992) considers a 


set of support points at the t-th iteration, 


Sf "{Si, S2j ■ • • 5 C ^y 

such that Si < ... < Smt and rut = |5t|, for constructing the envelope 
function qt{x) in a non-parametric way. We denote as Wi{x) as the straight 
line tangent to V{x) at s* for z = 1,... ,mt. Thus, we can build a piecewise 
linear function, 

Wt{x) =mm[wi{x),... ,Wmt{x)], xeX. (3) 

Hence, the proposal pdf dehned as qt{x) oc qt{x) = exp(lW(a;)), is formed by 
exponential pieces in such a way that Wt{x) > V{x), so that qt{x) > n{x), 
when V{x) is concave (i.e., 7r(x) is log-concave). Figure [^depicts an example 
of piecewise linear function Wt{x) built with m* = 3 support points. 



Figure 1: Example of construction of the piecewise linear function Wt{x) with mt = 3 
support points, such that Wt{x) > V{x). 

Table summarizes the ARS algorithm for drawing N independent sam¬ 
ples from 7r{x). At each iteration t, a sample x' is drawn from qt{x) and 
accepted with probability otherwise is rejected. Note that a new point 
is added to the support set ot whenever it is rejected in the RS test improv¬ 
ing the construction of qt{x). Glearly, denoting as T the total number of 
iterations of the algorithm, we have always T > N since several samples are 
discarded. 
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Table 1: Adaptive Rejection Sampling (ARS) algorithm. 

Initialization: 


1. Set t = 0 and n = 0. Choose an initial set Sq = {si,..., Smo}- 

Iterations (while n < N): 

2. Bnild the proposal qt{x), given the set of snpport points St = 

{si }, according to Eq. g. 

3. Draw x' ~ qt{x) (x qt{x) and u' i/(|0,l|). 

4. If u' > then reject x', npdate 

= St'S 

and set t t + 1. Go back to step 2. 

5. If m' < , then accept x', setting Xn = x'. 

6. Set St+i t = t + l, n = n + l and retnrn to step 2. 

Outputs: The N accepted samples Xi,... ,xn- 

3. Computational cost of ARS 

The computational cost of an ARS-type method depends on two elements: 

1. The averaged number of accepted samples, i.e., the acceptance rate. 

2. The computational effort required for sampling from qt{x). 

We desire that the acceptance rate is close to 1 and, simultaneously, that the 
spent time required for drawing from qt{x) is small. In general, there exists 
a trade-off since an increase of the acceptance rate requires the use of a more 
complicated proposal density qt{x). ARS is an automatic procedure which 
provides a possible compromise. Below, we analyze some important features 
of a standard ARS scheme. 
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3.1. Acceptance rate 

The averaged number of accepted samples, i.e., the acceptance rate, is 

^ I’ 

that is 0 < r/t < 1 since q > c^, Vt G N, by construction. Dehning the Li 
distance between nt{x) and p{x) as 

D{qt,7r) = \\qt{x)-7r{x)\\i= \qt{x) - 7r{x)\dx, (5) 

J X 

ARS ensures that D{qt,n) —)■ 0 when t —)■ cx), and as a consequence Ct ^ c^. 
Thus, Tjt tends to one as t —)■ cx). Indeed, as rjt —t 1, ARS becomes virtually 
an exact sampler after a some iterations. 

3.2. Drawing from the proposal pdf 

Let us denote the exponential pieces as 

h,{x) = e^^^^\ * = l,...,iV, (6) 


so that 


qt{x) = hi{x), for a; G X* = (ei_i, e*], i = l,...,N, 


where e* is the intersection point between the straight lines Wi{x) and Wi+i{x), 
for i = 2,..., iV — 1, and cq = —oo and cn = +C )0 (if A = M). Thus, for 
drawing a sample x' from qt{x) = ^qt{x), we need to: 

1. Compute analytically the area A* below each exponential piece, i.e., 
Ai = hi{x)dx and obtain the normalized weights 


Pi 




( 7 ) 


where we have observed that q = X]n=i = Xr qt{x)dx. 

2. Select an index j* (namely, one piece) according to the probability mass 
Pi,i = l,...,N. 
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3. Draw x' from hj*{x) restricted within the domain Xj* = (ej*_i, Cj*], and 
zero outside (i.e., from a truncated exponential pdf). 

Observe that, at step 2, a multinomial sampling is required. It is clear that 
the computational cost for drawing one sample from qt{x) increases as the 
number of pieces grows or, equivalently, the number of support points grows. 
Fortunately, the computational cost in ARS is automatically controlled by 
the algorithm, since the probability of adding a new support point 

Pt = l- r]t= -D{qt,7r), ( 8 ) 

Ct 

tends to zero as f —)• oo, since the distance in Eq. ([^ vanishes to zero, i.e., 

D{qt,Tr) -)■ 0 . 

4. ARS with fixed number of support points 

We have seen that the probability of adding a new support point Pt van¬ 
ishes to zero as f —>■ oo. However, for a hnite t, we have always a positive 
probability P* > 0 of adding a new point (although small), so that a new 
support point could be incorporated producing an increase of the acceptance 
rate. After a certain iteration r, i.e., t > r, this improvement of the ac¬ 
ceptance rate could not balance out the increase of the time required for 
drawing from the proposal, due to the addition of the new point. Namely, 
if the acceptance rate is enough close to 1, a further addition of a support 
point could slow down the algorithm, becoming prejudicial. 

In this work, we provide an alternative adaptive procedure for ARS, called 
Cheap Adaptive Rejection Sampling (CARS), which uses a hxed number of 
support points. When a sample is rejected, a test for swapping the rejected 
sample with the closest support point within St is performed, so that the total 
number of points remains constant. Unlike in the standard ARS method, in 
the new adaptive scheme the test is deterministic. The underlying idea is 
based on the following observation. The standard ARS algorithm yields 
a decreasing sequence of normalizing constants of the proposal pdf 

converging to = f^7r(x)dx, i.e., 

Cq P Cl ... P Ct ■■■ P C(X) (9) 

Clearly, since the acceptance rate is rjt = ^ this means that —)■ 1. In 
CARS, we provide an alternative way for producing this decreasing sequence 
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of normalizing constants {q}. Indeed, an exchange between two points is 
accepted if it produces a reduction in the normalizing constant of the corre¬ 
sponding proposal pdf. More specihcally, consider the set 

<St = {si, S2, ■ ■ ■, Sm}, 

contained M support points. When a sample x' is rejected in the RS test, 
the closest support point s* in St is obtained, i.e., 

s* = arg min Isj — x'l. 

Si^St 

We recall that we denote with qt{x) the proposal pdf built using St and with 
Ct its normalizing constant. Then, we consider a new set 

e; = 5iU{x'}\K}, (10) 

namely, including x' and removing s*. We denote with g{x) the proposal 
built using the alternative set of support points Q, and Cg = f^g(x)dx. If 

Cg Q, 

then the swap is accepted, i.e., we set 5^+1 = Q for the next iteration, oth¬ 
erwise the set remains unchanged, St+i = St- The complete algorithm is 
outlined in Table Note that q is always computed (in any case, for both 
ARS and CARS) at the step 3, for sampling from qt(x). Furthermore ob¬ 
serve that, after the hrst iteration, step 2 can be skipped since the new 
proposal pdf qt+i(x) has been already constructed in the previous iteration, 
i.e., qt+i(x) = qt(x), or at step 4.3, i.e., qt+i(x) = g(x). 

Therefore, with the CARS algorithm, we obtain again a decreasing se¬ 
quence of {Ct}t6N 

Cq ^ Cl ... Ct ’ ’ ’ ^ Cqo , 

but Coo 7^ Ctt so that r^oo < 1, in general. The value poo is the highest 
acceptance rate that can be obtained with M support points, given the target 
function 7r(x). Therefore, CARS yields a sequence of sets iSi, ... ,St,... that 
converges to the stationary set S^o containing the best conhguration of M 
support points for maximizing the acceptance rate, when the target function 
is 7r{x) and given a specihc construction procedure for the proposal q't(a;)|^ 


^The best configuration S^o depends on the specific construction procedure employed 
for building the sequence of proposal functions qi,q 2 ■ ■ ■ ,qt, ■ ■ ■ 



Table 2: Cheap Adaptive Rejection Sampling (CARS) algorithm. 

Initialization: 


1. Set t = 0 and n = 0. Choose a value M and an initial set Sq = 
{si,..., Sm}- 

Iterations (while n < N): 

2. Build the proposal qt{x), given the current set St, according to Eq. 
([^ or other suitable procedures. 

3. Draw x' ~ qt{x) oc qt{x) and u' i/(|0,l|). 

4. If u' > then reject x' and: 

4.1 Find the closest point s* in St, 


s* = arg min \si — x' 

Si^St 


4.2 Build the alternative proposal g{x) based on the set of points 


g = StU{x'}\{s*} 


and compute Cg = f^g(x)dx. 

4.3 If Cg < Ct, set St+i = g, otherwise, if Cg > Ct, set = St. 
Set t = t + 1 and go back to step 2. 


5. If m' < , then accept x', setting Xn = x'. 


6. Set St+i =St, t = t + l, n = n + l and return to step 2. 

Outputs: The N accepted samples xi,..., xn- 

5. Numerical simulations 

We consider a Gaussian density as (typical) log-concave target pdf and 
test both ARS and CARS. Namely, we consider 
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with = |. We compare ARS and CARS in terms of the time reqnired 
for generating N G {5000,1000} samples. In all cases and both techniqnes, 
we consider a initial set of snpport points 5o = {si,..., sq} with cardinality 
mo = |i5o| G {3,5,10} (clearly, M = ttiq in CARS) where the initial points 
are chosen nniformly in [—2,2] at each simnlation, i.e., Sj ~ M([-2,2])g 

We rnn 500 independent simnlations for each case and compnte the re¬ 
qnired time for generating N samples (nsing a Matlab code), the averaged 
nnmber of final snpport points (denote as E[mT\) and the acceptance rate 
reached in the hnal iteration (denoted as E[r]T\] averaged over the 500 rnns), 
for both techniqnes. Table shows the resnlts. The time is normalized with 
respect to the time spent by ARS with N = 5000, |5o| = 3. The resnlts show 
that CARS is always faster than ARS. We can observe that both methods 
obtain acceptance rates close to 1. CARS reaches acceptance rates always 
greater of 0.87 nsing only 3 nodes. CARS obtains an more than 0.98 employ¬ 
ing only 10 nodes and after generating N = 5000 independent samples. Fig. 
[^depicts the wasted time, the hnal acceptance rate and the hnal nnmber of 
nodes, as fnnction of nnmber N of generated samples. We can observe that 
CARS is signihcantly faster than ARS when N grows, owing to ARS yields 
a sensible increase of the nnmber of snpport points that corresponds to an 
inhnitesimal increase of the acceptance rate, whereas CARS the nnmber of 
nodes remains constant. Fignre shows a seqnence of proposal pdfs con- 
strncted by CARS, using 3 nodes and starting with Sq = {—1.5, —1,1.8}. 
The Li distance D{qt,7i) is reduced progressively and the acceptance rate 
improved. The hnal set of support point is St = { — 1.0261, —0.0173,1.0305}, 
close to the optimal one = { — 1, 0,1}. 
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Figure 2: (a) Spent time, (b) final acceptance rate, and (c) final number of support 
points, as function of the number N of drawn samples, for ARS (squares) and CARS 
(triangles). 
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Proposal after generating N=10000 samples (ts N) 



(c) 


Figure 3: Example of sequence of proposal pdfs obtained by CARS, starting with Sq = 
{ —1.5, —1,1.8}. We can observe that the Li distance D{qt,n) is reduced progressively. 
The proposal function qt{x) is depicted with dashed line, the target function Tr(x) with 
solid line and the support points with circles. The configuration of the nodes in figure (c) 
is St = {-1.0261,-0.0173,1.0305} with t > N = 10^. The optimal configuration with 3 
nodes and Tr{x) = exp {—x"^) is Soo = (—1,0,1}. 
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